Floating-Point Format Inference in Mixed-Precision
Abstract
In this article, we address the problem of determining the minimal precision of the inputs and of the intermediate results of a program containing floating-point computations, in order to ensure a desired accuracy on the outputs. The first novelty of our approach is the combination of a forward and a backward static analysis, both performed by abstract interpretation. The backward analysis computes the minimal precision needed on the inputs and intermediate values to reach the accuracy on the results specified by the user. The second novelty is that the analysis is expressed as a set of constraints made only of first-order predicates and affine integer relations, even when the analyzed programs contain non-linear computations. These constraints can be easily checked by an SMT solver. In practice, the information collected by our analysis may help to optimize the formats used to represent the values stored in the floating-point variables of programs, or to select the appropriate precision for sensors. A prototype implementing our analysis has been developed, and experimental results are presented.
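As a rough illustration of the flavour of such constraints (not the transfer functions of the article itself), the sketch below encodes simplified backward precision requirements for the toy program t = x * y; r = t + x as affine integer constraints over the number of significant bits of each variable, and asks the Z3 SMT solver, through its Python API, for a minimal assignment. The one-bit margins, the 24-bit accuracy target and the variable names are assumptions made for the example.

```python
# Illustrative sketch only: simplified backward precision rules, not the
# article's actual transfer functions.
from z3 import Int, Optimize, sat

nsb_x, nsb_y, nsb_t, nsb_r = Int("nsb_x"), Int("nsb_y"), Int("nsb_t"), Int("nsb_r")

opt = Optimize()
# Toy program:  t = x * y;  r = t + x
opt.add(nsb_r >= 24)                                 # accuracy requested on the output
opt.add(nsb_t >= nsb_r + 1, nsb_x >= nsb_r + 1)      # crude backward rule for the addition
opt.add(nsb_x >= nsb_t + 1, nsb_y >= nsb_t + 1)      # crude backward rule for the product
opt.add(nsb_x <= 53, nsb_y <= 53, nsb_t <= 53, nsb_r <= 53)
opt.minimize(nsb_x + nsb_y + nsb_t + nsb_r)          # smallest overall formats

if opt.check() == sat:
    print(opt.model())   # e.g. map nsb <= 24 to binary32, larger values to binary64
```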
Similar articles
Mixed-precision Fused Multiply and Add
The standard floating-point fused multiply and add (FMA) computes R=AB+C with a single rounding. This article investigates a variant of this operator where the addend C and the result R are of a larger format, for instance binary64 (double precision), while the multiplier inputs A and B are of a smaller format, for instance binary32 (single precision). With minor modifications, this operator is...
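As a software-level illustration of the behaviour described above (the operator in the article is a hardware unit), this mixed-precision FMA can be emulated with a plain binary64 multiply-add: the product of two binary32 significands (24 × 24 bits) fits exactly in binary64, so the only rounding is the final addition. The function name below is ours.

```python
import numpy as np

def mixed_fma(a: np.float32, b: np.float32, c: np.float64) -> np.float64:
    """Emulate R = A*B + C with binary32 A, B and binary64 C, R.

    float64(a) * float64(b) is exact (48 significand bits fit in 53),
    so the single rounding happens in the final binary64 addition.
    """
    return np.float64(a) * np.float64(b) + c

# Example: cancellation handled with a single rounding.
a, b = np.float32(1.0000001), np.float32(1.0000001)
print(mixed_fma(a, b, np.float64(-1.0)))
```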
Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks
Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present th...
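Flexpoint pairs integer tensor entries with a single shared exponent that is managed adaptively during training. The sketch below only illustrates the basic idea of a tensor-wide shared exponent; it is not the paper's format or its exponent-management algorithm, and the helper names are ours.

```python
import numpy as np

def to_shared_exponent(x: np.ndarray, mantissa_bits: int = 16):
    """Quantize a tensor to integer mantissas with one shared exponent.

    Illustrative only: the exponent is chosen so the largest magnitude in
    the tensor roughly fits in `mantissa_bits` signed bits (corner cases
    such as exact powers of two are ignored for brevity)."""
    max_mag = float(np.max(np.abs(x))) or 1.0
    exp = int(np.floor(np.log2(max_mag))) + 1 - (mantissa_bits - 1)
    mant = np.round(x / 2.0 ** exp).astype(np.int32)
    return mant, exp

def from_shared_exponent(mant: np.ndarray, exp: int) -> np.ndarray:
    return mant.astype(np.float64) * 2.0 ** exp

x = np.array([0.12, -3.5, 7.25])
mant, exp = to_shared_exponent(x)
print(from_shared_exponent(mant, exp))   # close to x, all entries share `exp`
```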
Automated Floating-Point Precision Analysis
Dissertation: Automated Floating-Point Precision Analysis. Michael O. Lam, Doctor of Philosophy, 2014. Directed by Professor Jeffrey K. Hollingsworth, Department of Computer Science. As scientific computation continues to scale upward, correct and efficient use of floating-point arithmetic is crucially important. Users of floating-point arithmetic encounter many problems, inc...
Rethinking Numerical Representations for Deep Neural Networks
With the ever-increasing computational demand of deep learning, it is critical to investigate the implications of the numeric representation and precision of DNN model weights and activations for computational efficiency. In this work, we explore unconventional narrow-precision floating-point representations as they relate to inference accuracy and efficiency, to steer the improved design of future D...
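One common way to experiment with narrow floating-point representations in software is to truncate the significand of binary32 values, as sketched below; this is a generic simulation technique, not the specific representations studied in the paper, and the helper name is ours.

```python
import numpy as np

def truncate_significand(x: np.ndarray, mantissa_bits: int) -> np.ndarray:
    """Emulate a narrower float by keeping `mantissa_bits` of binary32's
    23 explicit fraction bits (round-toward-zero, exponent range unchanged)."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    drop = 23 - mantissa_bits
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)
    return (bits & mask).view(np.float32)

w = np.random.randn(4).astype(np.float32)
print(w, truncate_significand(w, 7))   # 7-bit significand, bfloat16-like error
```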
Multi-mode Floating Point Multiplier with Parallel Operations
Most modern processors have hardware support for single-precision and double-precision floating-point multiplication. For many scientific computations, such as climate modeling, computational physics, and computational geometry, this support is inadequate; these applications require quadruple-precision arithmetic, which provides twice the precision of the double-precision format. The proposed design perf...
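The design discussed above is a hardware multiplier; as a purely software counterpart that illustrates the same precision gap, the classic Dekker/Veltkamp two-product below recovers the exact product of two binary64 values as an unevaluated sum of two binary64 numbers, the building block of double-double (quadruple-like) arithmetic. This is a standard textbook algorithm, not the paper's design.

```python
def two_prod(a: float, b: float):
    """Dekker/Veltkamp two-product: return (p, e) with p + e == a * b exactly
    (assuming round-to-nearest binary64 and no overflow or underflow)."""
    split = 134217729.0          # 2**27 + 1, splits a 53-bit significand in two
    p = a * b
    a_big = a * split
    a_hi = a_big - (a_big - a)
    a_lo = a - a_hi
    b_big = b * split
    b_hi = b_big - (b_big - b)
    b_lo = b - b_hi
    e = ((a_hi * b_hi - p) + a_hi * b_lo + a_lo * b_hi) + a_lo * b_lo
    return p, e

p, e = two_prod(1.0 + 2.0 ** -30, 1.0 + 2.0 ** -30)
print(p, e)   # e carries the bits that do not fit in a single binary64
```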